Hierarchical Characterization of Complex Networks 

Luciano da Fontoura Costa and Filipi Nascimento Silva * 
February 2, 2008 



Abstract 

While the majority of approaches to the characteriza- 
tion of complex networks has relied on measurements 
considering only the immediate neighborhood of each 
network node, valuable information about the net- 
work topological properties can be obtained by con- 
sidering further neighborhoods. The current work 
discusses how the concepts of hierarchical node degree and
hierarchical clustering coefficient (introduced in
cond-mat/0408076), complemented by new hierarchical
measurements, can be used in order to obtain a powerful set of
topological features of complex
networks. The interpretation of such measurements 
is discussed, including an analytical study of the hi- 
erarchical node degree for random networks, and the 
potential of the suggested measurements for the char- 
acterization of complex networks is illustrated with re- 
spect to simulations of random, scale-free and regular 
network models as well as real data (airports, proteins 
and word associations). The enhanced characteriza- 
tion of the connectivity provided by the set of hierar- 
chical measurements also allows the use of agglomer- 
ative clustering methods in order to obtain taxonomies 
of relationships between nodes in a network, a possi- 
bility which is also illustrated in the current article. 



1 Introduction 

Graph theory and statistical mechanics are well- 
established areas in mathematics and physics, respec- 
tively. Since its beginnings in the XVIII century, with 
the solution of the bridges problem by L. Euler, graph 
theory has progressed all the way to the forefront of 
theoretical and applied investigations in mathemat- 
ics and computer science. Much of the importance 
of this broad area stems from the generality of graphs 
as representational models. As a matter of fact, most 
discrete structures including matrices, trees, queues, 
among many others, are but particular cases of graphs. 
The potential of graphs is further extended by models 
where features are assigned to nodes, different types 
of nodes and/or edges are allowed to co-exist, synchronization
schemes are incorporated, and so on (see, for instance, [1]).
At the same time, statistical mechanics, also drawing on a rich
past of accomplishments, provides concepts and tools for
bridging the
gap between dynamics in the micro and macro realms. 
Of particular interest have been the investigations on 
phase transitions and complex systems, which repre- 
sent a major area of development today. 

While graph theory provides effective means for 
characterizing, modeling and simulating the structure 
of natural phenomena, statistical mechanics contains 
the methods for analyzing the dynamics of natural phenomena
along several scales. The novel area of complex networks [1, 2]
can be understood as a fortunate intersection between those two
major areas, therefore allowing a natural and powerful means for
integrating structure and dynamics. With origins extending back
to the pioneering developments of Flory [3], Rapoport [4] and
Erdos and Renyi [5], the area of complex networks was boosted
more recently by the advances by Watts and Strogatz [6, 7] and
Barabasi and collaborators [8].

Complex network investigations frequently involve 
the measurement of topological features of the ana- 
lyzed structures, such as the node degree (namely the 
number of edges attached to a node) and the clus- 
tering coefficient (quantifying the connectivity among 
the immediate neighbors of a node). Although degenerate,
in the sense that they do not allow a one-to-one
identification of the possible network architectures, 
such a pair of measurements does provide a rich char- 
acterization of the connectivity of the networks. As a 
matter of fact, particularly interesting network mod- 
els, such as the small- world Q |21 17| |51 and scale-free 
(Barabasi- Albert) fT T "51, are characterized in terms 
of specific types of node degree distributions (logarith- 
mic and power-law, respectively). 

Although such distributions emphasize important 
properties of the analyzed networks, further valuable 
topological information can be gathered not only by 
considering the clustering coefficient, but also by ana- 
lyzing such features along the hierarchical levels of the 
networks 0^01. While some attention has been fo- 
cused on the relevant issue of hierarchy in complex 
networks (e.g. [Jl, 021 Il3l|l7||llil9,,20i^ 22. 22(131 



051 ESI 1271 1. and hierarchical extensions of the node 
degree and clustering coefficient were only more recently
formalized in [9, 10] by using concepts derived
from mathematical morphology l25l l29l l30l includ- 
ing dilations and distance transforms in graphs. Despite 
their recent introduction, such concepts have already 
yielded valuable results when applied to essential- 
ity of protein-protein interaction networks |37|, bone 
structure characterization L38J . and community find- 
ing |3a|33|. 

The purpose of the current article is to review and 
further extend the concepts of hierarchical measure- 
ments, which is done by the consideration of the con- 
cepts of radial reference system and hierarchical common 
degree, as well as the introduction of the measurements 
of hierarchical edge degree, inter-ring degree, intra-ring
degree, convergence ratio, and edge clustering coefficient. The
extensions of these measurements (excluding the clustering
coefficient) to weighted and directed networks are also
described in this work. We start by
presenting the basic concepts and discussing hierar- 
chies in complex networks in terms of virtual nodes and 
proceed by describing, interpreting and discussing the 
hierarchical measurements. An analytical character- 
ization of the general shape of the hierarchical node 
degree in random networks is also presented, and the 
potential of the reported concepts and methods is il- 
lustrated with respect to the characterization of simu- 
lated random, scale-free and regular network models. 
Such a potential is further illustrated with respect to 
real networks, including word associations, airports, 
and protein-protein interactions. Because the hierar- 
chical measurements provide a rich characterization 
of the connectivity around each network node, it becomes
possible to use clustering methods [15, 14] in
order to organize the nodes in a network into a tax- 
onomical scheme reflecting the similarities between 
their connectivity. This possibility is also illustrated 
in the present article. 



2 Notation and Basic Concepts 

Let the graph or network Γ of interest contain N nodes and e
edges, and the connections between any two nodes i and j be
represented as (i, j). Although non-oriented graphs are assumed
henceforth, all reported concepts and methods can be immediately
extended to digraphs and weighted networks. We henceforth assume
the complete absence of loops (i.e. self-connections). A
non-oriented graph can be completely specified in terms of its
adjacency matrix K, with each connection (i, j) implying
K(i, j) = K(j, i) = 1. The absence of a connection between nodes
i and j is represented as K(i, j) = K(j, i) = 0. Now, the node
degree




Figure 1 : Three situations yielding the same clustering 
coefficient (equal to 1) for the reference node i. 



k(i) of a node i of Γ can be defined as

$$k(i) = \sum_{j=1}^{N} K(i,j) = \sum_{j=1}^{N} K(j,i). \qquad (1)$$

Observe that the degree of node i corresponds to the 
number of edges attached to that node, representing a 
direct measurement of the connectivity of that specific 
node. Indeed, the overall connectivity of a specific network
can be quantified in terms of its average node
degree ⟨k⟩. While a random network is characterized
by a typical average node degree with relatively low 
standard deviation, a scale-free model will present a 
power-law log-log distribution of node degrees, favor- 
ing the existence of hubs (i.e. nodes with high node 
degree). 

The clustering coefficient of a network node i can be defined as
quantifying the connectivity among the immediate neighbors of i,
which are henceforth represented by the set R_1(i). More
specifically, in case that node has n_1(i) immediate neighbors
(i.e., the cardinality of R_1(i)), implying a maximum number
e_T(i) = n_1(i)(n_1(i) − 1)/2 of connections between such nodes,
and e(i) connections are observed among such neighbors, the
clustering coefficient of i can be calculated as

$$cc(i) = \frac{e(i)}{e_T(i)} = \frac{2\,e(i)}{n_1(i)\,(n_1(i) - 1)}. \qquad (2)$$

Observe that 0 ≤ cc(i) ≤ 1, with the minimum and maximum values
being achieved for complete absence of connections (for
cc(i) = 0) and complete connectivity among the neighbors of i
(for cc(i) = 1).
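
As a minimal sketch, Equations 1 and 2 can be evaluated directly from the adjacency matrix K. The 4-node network, the function names and the convention of returning 0 for nodes with fewer than two neighbors are assumptions made only for this illustration.

```python
def node_degree(K, i):
    """Degree of node i as the sum of row i of the adjacency matrix (Equation 1)."""
    return sum(K[i])


def clustering_coefficient(K, i):
    """Fraction of the possible edges among the neighbors of i that exist (Equation 2)."""
    neighbors = [j for j, linked in enumerate(K[i]) if linked]
    n1 = len(neighbors)
    if n1 < 2:
        return 0.0  # assumed convention: undefined cases are reported as 0
    # Count each edge among the neighbors once (pairs taken in list order).
    e = sum(K[a][b] for pos, a in enumerate(neighbors) for b in neighbors[pos + 1:])
    return 2.0 * e / (n1 * (n1 - 1))


# Hypothetical 4-node network: node 0 is linked to 1, 2 and 3, plus the edge 1-2.
K = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]

print(node_degree(K, 0))             # 3
print(clustering_coefficient(K, 0))  # one existing edge out of three possible ones
print(clustering_coefficient(K, 1))  # 1.0, since the two neighbors of node 1 are linked
```

Node 1 reaches the maximum value with only two neighbors, which is the kind of degeneracy that the number of neighbors n_1(i) helps to resolve.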

Although the clustering coefficient provides a pow- 
erful indication about the connectivity among the 
neighbors of the reference node, several different situations
(see Figure 1) may yield the same clustering
coefficient value (1 for these examples), which is a consequence
of the fact that this measurement is relative
to the total number of connections among the elements 
Figure 2: A small network and a reference node i. The
virtual edge between nodes i and j, one of the many 
of such a kind in this network, is represented by the 
dashed line. 



of R_1(i). Such situations can be distinguished by considering
the respective value of n_1(i).

3 Virtual Edges and Hierarchies 

Consider the situation depicted in Figure 2, where a
reference node i = 1 is connected to several other
network nodes. The set of immediate neighbors of
i, hence R_1(i), is identified by the innermost ellipse.
Observe that although no connection is observed
between nodes i and j, information from the former 
node can propagate to the latter through the relay node 
r, which is indicated by the virtual edge shown as a 
dashed line. 

In the case of weighted networks, the virtual edges
may take into account the cumulative effect of the
respective weights. For instance, in case we had in
Figure 2 W(i, r) = 3 and W(r, j) = 4, the weight
of the virtual edge extending from i to j would be
W(i, j) = (3)(4) = 12.
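
The product of weights along the relay path can be checked with two applications of a weight matrix to the indicator vector of the reference node. This is only a sketch: the three-node chain, the node labels (0 for i, 1 for r, 2 for j) and the helper `mat_vec` are hypothetical.

```python
def mat_vec(W, v):
    """Product of the square matrix W (list of rows) and the column vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


# Hypothetical labels: 0 stands for i, 1 for the relay r and 2 for j.
W = [
    [0, 3, 0],
    [3, 0, 4],
    [0, 4, 0],
]
v = [1, 0, 0]        # indicator vector of the reference node i
v1 = mat_vec(W, v)   # weights of the direct edges leaving i
v2 = mat_vec(W, v1)  # cumulative weights of the two-step paths starting at i

print(v1)     # [0, 3, 0]
print(v2[2])  # 12, the weight of the virtual edge between i and j
```

The entry v2[0] is also non-zero because the two-step walk i, r, i returns to the reference node.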

The concept of virtual edge can be immediately extended by
considering further distances d from the reference node. Such an
extension can be naturally defined in terms of the weight matrix
W representing the complex network of interest (observe that
W = K for weightless networks). Let v(i) be a column vector with
N elements equal to zero, except that at the i-th position
(recall that i is the label of the reference node), which is
assigned unit value. Let the vector v_1(i) be defined as

$$v_1(i) = W\,v(i), \qquad (3)$$

and let the generalized Kronecker delta b = δ(a) be the operator
acting on a vector a in order to produce a vector b such that
each element b(j) of b is one if and only if a(j) is different
from zero, and zero otherwise. By applying such an operator on
v_1(i) we obtain

$$p_1(i) = \delta(v_1(i)). \qquad (4)$$

The set of immediate neighbors of i, i.e. R_1(i), can now be
obtained as corresponding to the indices of the elements of
p_1(i) which are equal to 1. For example, we have for the
situation depicted in Figure 2 that
R_1(i = 8) = {2, 5, 7, 9, 12}.

The above matrix framework can be extended to any neighborhood
of i by introducing the vector v_d(i) defined as

$$v_d(i) = W^{d}\,v(i). \qquad (5)$$

The weights of the virtual edges between i and the remaining
network nodes at distance d are given by the successive entries
of v_d, i.e. W_d(i, j) = v_d(j). Observe that the distance d
between two nodes i and j is henceforth understood as
corresponding to the number of edges along the shortest path
between those two nodes.

The set of neighbors of i placed at distances varying from 0 to
d from the reference node i, henceforth represented as B_d(i)
and referred to as the ball of radius d centered at i, can be
verified to correspond to the non-zero entries in the vector
p_d(i) defined as follows

$$p_d(i) = \delta\!\left(\sum_{j=1}^{d} v_j(i) + v(i)\right). \qquad (6)$$

For instance, the ball of radius 2 centered at i = 8 in Figure 2
corresponds to the whole network in that figure. Now, the set of
network nodes which are exactly at distance d from the reference
node i can be obtained as the unit entries in the vector

$$r_d(i) = p_d(i) - p_{d-1}(i). \qquad (7)$$

The set obtained from the above vector has also been called [10]
the ring of radius d centered at i, being henceforth represented
as R_d(i). Observe that the ring of radius 2 centered at i = 8
in Figure 2 is R_2(8) = {1, 3, 4, 6, 10, 11, 13}.
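
The matrix framework above (indicator vector, repeated products by W, the delta operator, balls and rings) can be sketched as follows. The 5-node network is hypothetical, the function names are ours, and non-negative weights are assumed so that the accumulated sums cannot cancel.

```python
def mat_vec(W, v):
    """Product of the square matrix W (list of rows) and the column vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def delta(a):
    """Generalized Kronecker delta: 1 where the entry is non-zero, 0 otherwise."""
    return [1 if x != 0 else 0 for x in a]


def ball_vectors(W, i, d_max):
    """Indicator vectors p_0, ..., p_dmax of the balls of radius d centered at i."""
    v = [0] * len(W)
    v[i] = 1
    acc = v[:]                # running sum v + v_1 + ... + v_d
    balls = [delta(acc)]
    v_d = v
    for _ in range(d_max):
        v_d = mat_vec(W, v_d)                        # v_d = W^d v
        acc = [a + b for a, b in zip(acc, v_d)]
        balls.append(delta(acc))
    return balls


def ring_vectors(W, i, d_max):
    """Indicator vectors r_d = p_d - p_(d-1) of the rings of radius d centered at i."""
    balls = ball_vectors(W, i, d_max)
    rings = [balls[0]]
    for d in range(1, d_max + 1):
        rings.append([p - q for p, q in zip(balls[d], balls[d - 1])])
    return rings


# Hypothetical network with edges 0-1, 0-2, 1-2, 1-3 and 3-4.
W = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0],
    [1, 1, 0, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 1, 0],
]

for d, r in enumerate(ring_vectors(W, 0, 3)):
    print(d, r)
```

For reference node 0 the rings are {0}, {1, 2}, {3} and {4}, one per hierarchical level.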

The subnetwork defined by the nodes at a specific ring R_d(i),
together with the edges between them, is henceforth represented
as γ_d(i). We are now ready to define the hierarchical level d
of a complex network as corresponding to the nodes in γ_d(i) and
the edges extending between such nodes and the nodes in
γ_{d+1}(i). The two hierarchical levels of nodes existing in the
network shown in Figure 2 are identified by the inner and
outermost ellipses, respectively. Observe that the hierarchies d
provide a radial reference frame or coordinate system which can
be used to partially identify nodes and edges with respect to
the reference node i. The
concept o hierarchy in a complex network is also re- 
lated to the concept of roles l34l and the distance trans- 
form f55"3ni of the nodes in the original network T 
with respect to the reference node 1 10|. 

Observe that statistics of the number of hierarchical levels d
while considering several nodes in a complex network provide a
valuable characterization of its topology. Generally speaking,
d tends to increase with the density of connections up to a
peak, decreasing afterwards. At the same time, as will become
clear along the remainder of this article, the more connected
the network is, the fewer hierarchical levels it tends to have.
It should also be observed that algorithmic implementations of
hierarchy identification, such as those reported in [9] and [10]
(see also [35]), are typically more computationally efficient
than the use of the matrix arithmetic presented in this
Section.
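
One possible algorithmic alternative, shown here only as a sketch and not as the procedure of the cited references, is a breadth-first search that assigns each node its distance to the reference node and groups nodes by distance. The adjacency-list representation and the name `rings_bfs` are assumptions.

```python
from collections import deque


def rings_bfs(adj, i):
    """Return the rings R_0(i), R_1(i), ... as a list of sets, using breadth-first search.

    adj maps each node to the set of its neighbors (non-oriented network).
    """
    dist = {i: 0}
    queue = deque([i])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:          # first visit gives the shortest distance
                dist[w] = dist[u] + 1
                queue.append(w)
    rings = [set() for _ in range(max(dist.values()) + 1)]
    for node, d in dist.items():
        rings[d].add(node)
    return rings


# Hypothetical network with edges 0-1, 0-2, 1-2, 1-3 and 3-4.
adj = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1}, 3: {1, 4}, 4: {3}}

print(rings_bfs(adj, 0))
```

Each edge is inspected at most twice, so the cost grows with the number of nodes and edges rather than with repeated matrix products.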



4 Hierarchical Measurements 

The concept of hierarchical level introduced above al- 
lows a natural and powerful extension of traditional 
measurements such as the node degree and clustering 
coefficient. This section defines such features as well 
as ancillary measurements which can be used in order 
to obtain a more complete characterization of complex 
networks. The considered measurements can be generalized to
weighted networks with the modifications described along with
each measurement. When considering oriented graphs, a new
network can be obtained by retaining only the incoming or only
the outgoing connections of each node.

The hierarchical node degree of a reference node i at distance d
is henceforth defined as corresponding to the number of edges
extending between the nodes in R_d(i) and R_{d+1}(i). This
measurement is henceforth represented as k_d(i). As an example,
in Figure 2 we have that k_0(8) = 5 (corresponding to the
traditional node degree) and k_1(8) = 8. Observe that the
hierarchical node degree is not averaged among the number of
nodes in R_d(i). Actually, this measurement can be understood as
the traditional node degree where the reference node is
understood as corresponding to the ball B_d(i) (i.e. the nodes
in this ball are merged into a subsumed node). This measure can
be extended to weighted networks by taking the sum of the weight
values for every connection between these nodes and the nodes of
the next level.

Let the number of edges in the subnetwork γ_d(i) be expressed as
e_d(i), and the number of elements of the ring R_d(i) be
represented as n_d(i). The hierarchical clustering coefficient
of node i at distance d, hence cc_d(i), can be obtained in terms
of the immediate generalization of Equation 2

$$cc_d(i) = \frac{2\,e_d(i)}{n_d(i)\,(n_d(i) - 1)}. \qquad (8)$$

For node i = 8 in the simple network shown in Figure 2 we have
that cc_1(8) = 0.3 and cc_2(8) ≈ 0.19.

Other interesting hierarchical measurements which 
can be obtained with respect to the reference node i 
and which can be used to diminish the degeneracy of 
the node degree and clustering coefficient include the 
following: 

Convergence ratio (C_d(i)): Corresponds to the ratio between the
hierarchical node degree of node i at distance d and the number
of nodes in the ring at the next level distance, i.e.

$$C_d(i) = \frac{k_d(i)}{n_{d+1}(i)}. \qquad (9)$$

This measurement quantifies the average number of edges received
by each node in the hierarchical level d + 1. We have
necessarily that C_0(i) = 1 for whatever node selected as the
reference i. In the case illustrated in Figure 2 we have
C_0(8) = 1 and C_1(8) = 8/7, indicating a low level of edge
convergence into the nodes in R_{d+1}(i), here R_2(8).

Intra-ring degree (A_d(i)): This measurement is obtained by
taking the average among the degrees of the nodes in the
subnetwork γ_d(i). Observe that only those edges between the
nodes in such a subnetwork are considered, therefore overlooking
the connections established by such nodes with the nodes in the
hierarchical levels at d − 1 and d + 1. For instance, we have
for the situation in Figure 2 that A_1(8) = 6/5 and
A_2(8) = 8/7. For weighted networks the value of the intra-ring
degree is the average of the weights of all nodes in such a
subnetwork.

Inter-ring degree (E_d(i)): The average of the number of
connections between each node in ring R_d(i) and those in
R_{d+1}(i). For instance, for Figure 2 we have E_0(8) = 5,
E_1(8) = 8/5 and E_2(8) = 0. Observe that
E_d(i) = k_d(i)/n_d(i).

Hierarchical common degree (H_d(i)): The average node degree
among the nodes in R_d(i), considering all edges in the original
network. For Figure 2 we have H_1(8) = (5 + 6 + 8)/5 = 19/5,
where the three terms count the edge ends that the nodes of ring
1 share with levels 0, 1 and 2, and H_2(8) = 16/7. The
hierarchical common degree expresses the average node degree at
each hierarchical level, indicating how the network node degrees
are distributed along the network hierarchies.
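
The definitions above can be gathered in a short routine that computes, for every hierarchical level of a reference node, the quantities n_d, e_d, k_d, cc_d, C_d, A_d, E_d and H_d. This is a sketch for weightless, non-oriented networks: the 5-node network, the function names and the convention of returning 0 where a ratio is undefined are assumptions.

```python
from collections import deque


def rings_bfs(adj, i):
    """Rings R_0(i), R_1(i), ... as a list of sets (breadth-first search)."""
    dist = {i: 0}
    queue = deque([i])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    rings = [set() for _ in range(max(dist.values()) + 1)]
    for node, d in dist.items():
        rings[d].add(node)
    return rings


def hierarchical_measurements(adj, i):
    """One dictionary of measurements per hierarchical level d of reference node i."""
    rings = rings_bfs(adj, i)
    level = {node: d for d, ring in enumerate(rings) for node in ring}
    results = []
    for d, ring in enumerate(rings):
        n_d = len(ring)
        # Every intra-ring edge is seen from both of its endpoints.
        e_d = sum(1 for u in ring for w in adj[u] if level[w] == d) // 2
        # Edges between ring d and ring d + 1 (hierarchical node degree).
        k_d = sum(1 for u in ring for w in adj[u] if level[w] == d + 1)
        n_next = len(rings[d + 1]) if d + 1 < len(rings) else 0
        results.append({
            "n": n_d,
            "e": e_d,
            "k": k_d,
            "cc": 2.0 * e_d / (n_d * (n_d - 1)) if n_d > 1 else 0.0,
            "C": k_d / n_next if n_next else 0.0,
            "A": 2.0 * e_d / n_d,
            "E": k_d / n_d,
            "H": sum(len(adj[u]) for u in ring) / n_d,
        })
    return results


# Hypothetical network with edges 0-1, 0-2, 1-2, 1-3 and 3-4.
adj = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1}, 3: {1, 4}, 4: {3}}

for d, row in enumerate(hierarchical_measurements(adj, 0)):
    print(d, row)
```

In this small example level 1 contains the two linked nodes 1 and 2, so cc_1 = 1, A_1 = 1, E_1 = 1/2 and H_1 = 5/2, while C_0 = 1 as stated above for any reference node.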

It is also interesting to eventually consider versions of the
above described measurements considering the ball B_d(i), and
not the ring R_d(i). Table 1 summarizes the hierarchical
measurements reviewed/introduced in the current article, all of
which are defined with respect to one of the network nodes,
identified by i, taken as a reference and at a distance d from
that node. Observe that most measurements are averaged among the
number of nodes in R_d(i), except the first three features in
Table 1.

Symbol   Description
e_d(i)   hier. number of edges among the nodes in the ring R_d(i)
n_d(i)   hier. number of nodes in the ring R_d(i)
k_d(i)   hierarchical degree of node i at distance d
cc_d(i)  hier. clustering coefficient of node i at distance d
C_d(i)   convergence ratio at hierarchical level d
A_d(i)   intra-ring node degree of node i at distance d
E_d(i)   inter-ring node degree of node i at distance d
H_d(i)   hierarchical common degree of node i at distance d

Table 1: The hierarchical measurements considered in
the current article.



5 Edge Degree and Edge Clustering Coefficient

One important thing about the traditional node degree and
clustering coefficient is that these concepts have been defined
with respect to a network node and its immediate neighbors. It
would be interesting to extend such concepts with respect to
network edges. The generalization of the node degree and
clustering coefficient to any subset of nodes in a complex
network reported in [10] provides an immediate means to obtain
the above extensions.

Such a generalization can be immediately obtained by considering
more general vectors v(i) in the equations in the previous two
sections. More specifically, instead of assigning the value one
only to the vector element whose index corresponds to the label
of the reference node, we assign ones to the elements whose
indices correspond to the labels of all nodes in the subnetwork
of interest. For instance, in case we define the subnetwork γ as
including the nodes {1, 11} and respective edges in the network
in Figure 2, we have
v(γ) = (1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0)^T. Let us obtain
the ring centered at γ at distance 2. By applying Equation 5 we
have

v_1(γ) = (0,1,1,1,0,1,0,0,0,0,0,11,11)^T

and

v_2(γ) = (4,1,1,0,1,0,12,12,11,11,22,11,11)^T

and, through Equation 6, we obtain

p_1(γ) = (1,1,1,1,0,1,0,0,0,0,1,1,1)^T

and

p_2(γ) = (1,1,1,1,1,1,1,1,1,1,1,1,1)^T.

The vector specifying the ring centered at γ at distance d = 2
is now obtained by using Equation 7 as
r_2(γ) = p_2 − p_1 = (0,0,0,0,1,0,1,1,1,1,0,0,0)^T, from which
we finally obtain R_2(γ) = {5, 7, 8, 9, 10}.

The extension of the hierarchical node degree and hierarchical
clustering coefficient to an edge (instead of a node) can now be
easily obtained by first identifying the two nodes i and j
defining the edge of interest and making the nodes in γ
correspond to those two nodes. The hierarchical node degree and
hierarchical clustering coefficient can be obtained by using
immediate extensions of their respective definitions.
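
The generalization from a reference node to a reference subset, and in particular to an edge, can be sketched by starting a breadth-first search from all nodes of the subset at once. The network, the function names and the convention of returning 0 for rings with fewer than two nodes are assumptions made for this illustration.

```python
from collections import deque


def rings_from_subset(adj, seeds):
    """Rings centered at a set of seed nodes: ring d holds the nodes at distance d from it."""
    dist = {s: 0 for s in seeds}
    queue = deque(seeds)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    rings = [set() for _ in range(max(dist.values()) + 1)]
    for node, d in dist.items():
        rings[d].add(node)
    return rings


def hierarchical_degree(adj, rings, d):
    """Number of edges between ring d and ring d + 1."""
    if d + 1 >= len(rings):
        return 0
    nxt = rings[d + 1]
    return sum(1 for u in rings[d] for w in adj[u] if w in nxt)


def hierarchical_clustering(adj, rings, d):
    """Clustering coefficient of ring d: existing over possible edges inside the ring."""
    ring = rings[d]
    n = len(ring)
    if n < 2:
        return 0.0
    e = sum(1 for u in ring for w in adj[u] if w in ring) // 2
    return 2.0 * e / (n * (n - 1))


# Hypothetical network with edges 0-1, 0-2, 1-2, 1-3 and 3-4; the edge of interest is (1, 3).
adj = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1}, 3: {1, 4}, 4: {3}}
rings = rings_from_subset(adj, (1, 3))

print(rings)                                   # ring 0 is the edge itself
print(hierarchical_degree(adj, rings, 0))      # edges leaving the pair {1, 3}
print(hierarchical_clustering(adj, rings, 1))  # connectivity among the neighbors of the edge
```

Here the edge (1, 3) has three neighbors, nodes 0, 2 and 4, reached through three edges, and only one of the three possible connections among them is present.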



6 Analytical Results for Random 
Networks 

This section presents a mean-field analytical investi- 
gation of the typical values and behavior of the main 
measurements reviewed /introduced in the previous 
sections of this work. 

Consider the generic situation depicted in Figure 3, including
a reference node i and the several respectively defined
hierarchical levels, extending from 0 (corresponding to the
reference node) to d, and further. Recall that the subnetwork
γ_d(i) is the subgraph obtained by considering the n_d(i) nodes
at level d (i.e. the ring R_d(i)) and the e_d(i) edges among
those nodes. It can be shown that the following mean-field
recursive approximation holds for a random network with overall
mean degree ⟨k⟩

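$$\begin{cases}
n_d(i) \approx \eta(k_{d-1}, N - N_{d-1}) \\
N_d(i) \approx N_{d-1}(i) + n_d(i) \\
k_d(i) \approx \left(\frac{N - N_d(i)}{N}\right) \left(\sum_{j \in R_d(i)} k_j\right)
\end{cases} \qquad (10)$$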

where N_d(i) is the cumulative number of nodes from the
hierarchical level 0 up to level d (inclusive), i.e.
N_d(i) = Σ_{j=0}^{d} n_j(i), and the function η(a, b) gives
the average number of manners b objects can be taken, with
repetition, to fill a slots. Now, the average and variance of
the hierarchical node degree of node i at distance d can be
respectively approximated as
Figure 3: A generic situation in a complex network involving a reference node i (in black) and the respectively
defined hierarchical levels. 



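$$E[k_d(i)] \approx \left(\frac{N - N_d(i)}{N}\right) \langle k \rangle\, n_d(i) \qquad (11)$$
$$\mathrm{Var}[k_d(i)] \approx \left(\frac{N - N_d(i)}{N}\right)^{2} \langle k \rangle\, n_d(i) \qquad (12)$$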

Figures 4(a-i) show the hierarchical node degree for several
combinations of ⟨k⟩ and N. It is clear from this figure that the
hierarchical node degree curves are approximately symmetric with
respect to the abscissa P of the respective peak value, which is
a consequence of the finite size of the considered networks.
Actually, the following three situations can be identified
during the dynamic evolution of the hierarchical node degree for
a specific network node: (i) the hierarchical node degree
increases as more nodes imply links to more nodes; (ii) a peak
is achieved with abscissa P; and (iii) the node degree decreases
because of the finite size of the network, which implies the
'saturation' of the hierarchical expansion. Observe also that
higher connectivity, implied by large values of ⟨k⟩, tends to
reduce the value of P and, consequently, the number of
hierarchical levels of the networks. Such an effect is usually
accompanied by an increase of the heights of the respective
curves, in order to conserve the average node degree. It can
also be shown that the standard deviation tends to increase with
the values of the hierarchical node degree.
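
The rise, peak and saturation described above can be reproduced numerically with a mean-field iteration in the spirit of this section. The sketch below rests on loudly stated assumptions: the number of new nodes reached by a edges falling at random on b unvisited nodes is taken as b(1 − exp(−a/b)), every node is assigned the mean degree ⟨k⟩, and the fraction of edge ends pointing to unvisited nodes is taken as (N − N_d)/N. The names `eta` and `mean_field_profile` are ours.

```python
import math


def eta(a, b):
    """Assumed form: expected number of distinct objects hit when a draws fall on b objects."""
    if b <= 0:
        return 0.0
    return b * (1.0 - math.exp(-a / b))


def mean_field_profile(N, k_mean, max_levels=30):
    """Iterate n_d, N_d and k_d level by level, starting from a single reference node."""
    n = [1.0]
    N_cum = [1.0]
    k = [(N - N_cum[0]) / N * k_mean * n[0]]
    for d in range(1, max_levels + 1):
        n_d = eta(k[d - 1], N - N_cum[d - 1])   # new nodes reached at level d
        N_d = N_cum[d - 1] + n_d                # cumulative number of nodes
        k_d = (N - N_d) / N * k_mean * n_d      # edges pointing to unvisited nodes
        n.append(n_d)
        N_cum.append(N_d)
        k.append(k_d)
    return n, N_cum, k


n, N_cum, k = mean_field_profile(1000, 5.0)
peak = k.index(max(k))
print(peak, round(k[peak], 1))
```

Under these assumptions the curve for N = 1000 and ⟨k⟩ = 5 grows during the first levels, reaches an interior peak and then collapses as the unvisited nodes run out, and a larger ⟨k⟩ moves the peak to a smaller abscissa.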

Figure 5 shows the values of P, obtained by simulations using
Equation 10, for several values of ⟨k⟩ and N. Observe that, for
a fixed average degree ⟨k⟩, we have that P ≈ c log(N), for some
real constant c. It is clear from Figure 5 that the hierarchical
levels are much more speedily reduced with the increase of ⟨k⟩
than with the reduction of N, an effect which can also be
appreciated from Figure 4.

Figure 4: The hierarchical node degree for several
configurations of ⟨k⟩ and N. Observe that such curves are always
characterized by a peak, which is a consequence of the finite
size of the considered networks. Observe also that increased
connectivity, implied by larger values of ⟨k⟩, tends to reduce
the number of hierarchical levels in the network.

Figure 5: The values of the abscissa of the peak hierarchical
node degree for several values of log(⟨k⟩) and log(N).

7 Characterization of Complex 
Networks Models 

In order to further illustrate the potential of the hier- 
archical measurements discussed so far in this work, 
they have been used to characterize, through simula- 
tions, random, scale-free (i.e. Barabasi-Albert - BA) 
and regular network models. 

The random networks are generated by selecting edges with
uniform probability p. The BA networks are produced as described
in [2], i.e. starting with m_0 randomly interconnected nodes and
adding new nodes with m edges which are attached to the existing
nodes with probability proportional to their respective node
degrees. The considered regular networks are characterized by
each node being connected exactly to 8 other nodes. Two types of
regular networks have been studied in this article: one with
border effects, where the nodes at its border have a lesser
degree; and another without border effects, i.e. considering
toroidal connections. In both cases, the nodes are organized
into an L × L array, and each internal node (i.e. non-border
node), specified by its position (x, y) in such an array, is
connected to its 8-neighbors (x − 1, y), (x + 1, y),
(x − 1, y − 1), (x + 1, y − 1), (x − 1, y + 1), (x + 1, y + 1),
(x, y − 1), (x, y + 1). The random model assumes ⟨k⟩ = 15,
⟨k⟩ = 5 and ⟨k⟩ = 3, and the BA model considers ⟨k⟩ = 16,
⟨k⟩ = 6 and ⟨k⟩ = 4. These two models assume N = 10000. In the
case of the regular networks, N = 10000 (i.e. L = 100) and
⟨k⟩ = 8. Observe that the average node degree of the regular
network differs from those for the other two models as an
unavoidable consequence of the intrinsic topology of that
network.
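
A minimal sketch of two of the generators described above is given next: the regular 8-neighbor lattice, with or without toroidal connections, and the random network in which each possible edge is selected with uniform probability p. The function names, the coordinate-based node labels and the `seed` parameter are assumptions, and the BA model is not covered here.

```python
import random


def regular_lattice(L, toroidal):
    """L x L array where each node is linked to its 8 neighbors; borders wrap if toroidal."""
    offsets = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]
    adj = {(x, y): set() for x in range(L) for y in range(L)}
    for (x, y) in adj:
        for dx, dy in offsets:
            nx, ny = x + dx, y + dy
            if toroidal:
                nx, ny = nx % L, ny % L        # wrap around the borders
            elif not (0 <= nx < L and 0 <= ny < L):
                continue                       # neighbor falls outside the array
            adj[(x, y)].add((nx, ny))
    return adj


def random_network(N, p, seed=None):
    """Each of the N(N - 1)/2 possible edges is included with uniform probability p."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(N)}
    for i in range(N):
        for j in range(i + 1, N):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj


torus = regular_lattice(5, toroidal=True)
flat = regular_lattice(5, toroidal=False)
print(len(torus[(0, 0)]), len(flat[(0, 0)]), len(flat[(2, 2)]))
```

In the toroidal lattice every node has degree 8, whereas with border effects a corner node keeps only 3 neighbors. For the random model the expected mean degree is p(N − 1), so a target ⟨k⟩ can be obtained with p = ⟨k⟩/(N − 1).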

The remainder of this section presents the hierarchical
measurements obtained for each of the complex network types
described above. For the sake of comprehensiveness, three
instances of each model were considered with decreasing average
node degree, namely ⟨k⟩ = 15, 5, and 3 for the random graphs and
⟨k⟩ = 16, 6, and 4 for the Barabasi-Albert model.

Figure 6 shows the hierarchical number of nodes (average ±
standard deviation) obtained for the considered network models,
including three average degree values in the case of the BA and
random cases, while taking all nodes into account. The asterisks
indicate the position of the average shortest path between any
pair of nodes, which are included in order to provide a
reference for the hierarchical analysis. All curves are
characterized by a peak, except for the regular graph with no
border effects. The values of the hierarchical number of nodes
obtained for the random models are more susceptible to the
change of mean degree (i.e. Figs. 6(a-c)) than those values
obtained for the Barabasi-Albert networks. For a decrease from
⟨k⟩ = 16 to ⟨k⟩ = 6, the peak of the Barabasi-Albert model shows
a change of only one hierarchical level, while in the random
model, decreasing from ⟨k⟩ = 15 to ⟨k⟩ = 5, such a displacement
involves four levels. For a reduction from ⟨k⟩ = 5 to ⟨k⟩ = 3
(⟨k⟩ = 6 to ⟨k⟩ = 4 for BA), the change was one level for
Barabasi-Albert, and 3 levels for the random models. This is a
direct consequence of the fact that scale-free structures are
less susceptible to the removal of random edges (which is
equivalent to reducing the mean degree) than the random models.
Hubs in the BA model establish shortcuts between nodes, reducing
the contribution of the other edges to the average minimal
distance. The regular networks without border effects yielded
hierarchical numbers of nodes which are linearly increasing,
reflecting the basic structure of such models. However, the
regular networks with borders were characterized by a wide peak
and high variance of measurements. Interestingly, the peaks
obtained for the hierarchical number of nodes occur near the
average shortest path marked by the asterisks. Note that the
last level with a non-zero value corresponds to the graph
diameter [16].

Figure 6: Hierarchical number of nodes (average ± standard
deviation) for all considered networks, which are identified
above each graph. Observe that most curves are characterized by
a peak. The average value of the shortest path between any two
nodes is marked by an asterisk.
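
The relation between the hierarchical number of nodes, the average shortest path and the diameter can be checked on a small example by taking every node in turn as the reference. This is a sketch for connected networks with at least two nodes; the 4-node path graph and the function names are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path distance (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def hierarchy_summary(adj):
    """Mean number of nodes per level over all reference nodes, mean shortest path, diameter."""
    totals = {}
    path_sum = 0
    pair_count = 0
    diameter = 0
    for i in adj:
        for node, d in bfs_distances(adj, i).items():
            totals[d] = totals.get(d, 0) + 1
            if node != i:
                path_sum += d
                pair_count += 1
                diameter = max(diameter, d)
    n_nodes = len(adj)
    # A reference node without level d contributes zero to the mean at that level.
    mean_nodes = [totals[d] / n_nodes for d in sorted(totals)]
    return mean_nodes, path_sum / pair_count, diameter


# Hypothetical path graph 0-1-2-3.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
mean_nodes, mean_path, diameter = hierarchy_summary(adj)
print(mean_nodes, mean_path, diameter)
```

For the path graph the mean numbers of nodes per level are 1, 1.5, 1 and 0.5, so the last level with a non-zero value is level 3, which equals the diameter, while the average shortest path (5/3) lies close to the peak at level 1.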

The values of hierarchical node degrees, shown in Figure 7, are
similar to the respective measurements of hierarchical number of
nodes shown in Figure 6, except for an expected offset of one
hierarchical level to the left.

All curves obtained for the inter-ring degree, shown in Figure
8, are monotonically decreasing after the first hierarchical
level. Again, the curves obtained for the random networks are
more sensitive to variations of the average degree than those
obtained for the Barabasi-Albert model. The results for the
random networks show wider and smoother curves, while those
obtained for Barabasi-Albert tend to be sharper and to
concentrate on the left hand side, implying smaller peak
abscissae which are identical for the three considered average
degrees. Results obtained for the Barabasi-Albert cases also
show a peak at the first hierarchical level and present high
variance, which is a consequence of the high chance of finding a
hub on that level. All models, except for the regular cases, are
characterized by presenting the peak of the curve to the left of
the asterisk (i.e. the average shortest path). It is also
interesting to observe that although this measurement is closely
related to the hierarchical degree, the curves obtained for
these two features (i.e. Figures 7 and 8) are markedly
different, in the sense that the curves of the hierarchical
inter-ring degree obtained for the random model no longer
present the peak structure observed in Figure 7. The curves
obtained for the regular networks are also interesting, being
characterized by an initial stage of steep decay followed by a
plateau which tends to decrease for higher hierarchical levels
in the case of the regular network with borders.

The results for the intra-ring degree, shown in Figure 9,
are very similar to the hierarchical number of nodes
measurement, characterized by a peak, except for regular
networks, which exhibit a markedly different evolution
resembling the curves obtained for the inter-ring
degree. Note that for regular graphs with no border
effects, the final part of the curve tends to decrease
and saturate. The shapes of the BA and random curves
are closely similar to those obtained for the
hierarchical number of nodes.

Figure 7: Hierarchical node degrees obtained for all
the considered network models. The curves are similar
to those obtained for the hierarchical number of
nodes, except for an expected offset of one level.

network models.

Figure 9: Intra Ring Degree values for the considered
network models.
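
The intra-ring degree can be sketched in the same way. The Python fragment below assumes that the intra-ring degree at level d is the average degree of the nodes of ring d when only the edges between nodes of that same ring are kept, i.e. twice the number of such edges divided by the number of nodes in the ring. The graph is undirected, has no self-loops, and is made up for illustration.

```python
from collections import deque

def intra_ring_degree(adj, source):
    """Average degree inside each ring, counting only edges within the ring."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    depth = max(dist.values())
    n = [0] * (depth + 1)
    twice_e = [0] * (depth + 1)  # every intra-ring edge is seen from both ends
    for u, d in dist.items():
        n[d] += 1
        twice_e[d] += sum(1 for v in adj[u] if dist[v] == d)
    return [t / m for t, m in zip(twice_e, n)]

# Node 0 has three neighbors; two edges (1-2 and 2-3) lie inside ring 1.
toy = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
values = intra_ring_degree(toy, 0)
print(values)
```

In this toy case ring 1 holds three nodes and two internal edges, so its intra-ring degree is 4/3, while ring 0 (the reference node alone) yields zero.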

Figure 10 shows the values of hierarchical common
degree for the considered network models. These
distributions are characterized by a decreasing curve
after the first level, except for the regular graphs with
no border effects. Generally, these curves are similar to
those obtained for the inter-ring degrees, except that
the present curves are wider. Another observation is
that the average hierarchical common degree tends to
be higher at the initial hierarchical levels, which is a
consequence of the fact that the largest hubs present
in the BA model tend to be reached sooner, providing
bypasses to the other nodes and therefore reducing the
peak abscissae and number of hierarchical levels. This
is the main reason why all peaks in the BA networks
tend to be displaced to the left hand side relative to
those in the random networks. As with the other
measurements, it can be seen that the positions of the
peaks along the curves are less affected by variations
of the average node degree in the case of the BA
models. The curves for the random and regular models
turned out similar, being characterized by an interval
of nearly constant values at the intermediate part of
the curves. This is a direct consequence of the smaller
variance of traditional node degrees in those two
models as compared to the higher variance of the BA
cases.

Because the regular models have a fixed number of
connections for each node, the common degree
measurement results in a constant curve with value k = 8
for the network with border effects. As some nodes
(i.e. those at the border) do not have exactly the same
degree, the last levels have a smooth decrease but
higher standard deviation.
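
The behavior described above can be checked on a small example. The sketch below assumes that the regular model is a square lattice in which each node is linked to its 8 surrounding cells (an assumption suggested by k = 8, not confirmed here), and that the hierarchical common degree at level d is the average traditional degree of the nodes in ring d. Lattice size and function names are illustrative.

```python
from collections import deque

def moore_lattice(size, periodic):
    """size x size lattice; each node links to its 8 surrounding cells."""
    adj = {}
    for x in range(size):
        for y in range(size):
            nbrs = set()
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    if dx == 0 and dy == 0:
                        continue
                    nx, ny = x + dx, y + dy
                    if periodic:
                        nbrs.add((nx % size, ny % size))
                    elif 0 <= nx < size and 0 <= ny < size:
                        nbrs.add((nx, ny))
            adj[(x, y)] = nbrs
    return adj

def common_degree(adj, source):
    """Average traditional degree of the nodes in each ring around source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    totals, counts = {}, {}
    for node, d in dist.items():
        totals[d] = totals.get(d, 0) + len(adj[node])
        counts[d] = counts.get(d, 0) + 1
    return [totals[d] / counts[d] for d in sorted(totals)]

no_border = common_degree(moore_lattice(9, periodic=True), (4, 4))
bordered = common_degree(moore_lattice(9, periodic=False), (4, 4))
print(no_border)  # [8.0, 8.0, 8.0, 8.0, 8.0]
print(bordered)   # [8.0, 8.0, 8.0, 8.0, 4.75]
```

Seen from the central node, the lattice with borders stays at 8 until the last ring, which is made of border nodes with 3 or 5 neighbors; averaging over all reference nodes smooths this drop over the last levels.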

The curves of hierarchical clustering coefficients
turned out to be the most distinct among the three
considered network models and have the highest standard
deviations, as shown in Figure 11. Also involving an
intermediate constant interval, the curves obtained for
the random models correspond to the smallest clustering
coefficients among the models. Therefore, the
nodes at each ring of those networks are characterized
by low interconnectivity. The hierarchical clustering
coefficient curves obtained for the BA case present
much higher values and involve sharper peaks of
connectivity, tending to present another peak along the
last levels (see Figures 11b-c). The hierarchical
clustering coefficient obtained for the regular networks
has a monotonically decreasing behavior, with values
which start higher than those of the two other
considered models. The monotonic decay observed for
this case (i.e. regular networks) is explained by the
fact that both the number of nodes and edges increase
linearly along successive hierarchical levels for that
model (see Equation jS). Note that the curves for the
regular models with and without borders are similar.
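
The monotonic decay for the regular case can be reproduced numerically. The sketch below assumes a periodic square lattice with 8 neighbors per node (an assumption, as the lattice definition is not restated here) and takes the hierarchical clustering coefficient at level d to be the number of edges among the nodes of ring d divided by the maximum possible number, n(n - 1)/2. Under these assumptions ring d has 8d nodes and 8d + 4 internal edges, so the ratio falls roughly as 1/(4d).

```python
from collections import deque

def torus_lattice(size):
    """Periodic size x size lattice; each node links to its 8 surrounding cells."""
    steps = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]
    return {(x, y): {((x + dx) % size, (y + dy) % size) for dx, dy in steps}
            for x in range(size) for y in range(size)}

def hierarchical_clustering(adj, source):
    """Edges inside each ring divided by the maximum possible number of edges."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    rings = {}
    for node, d in dist.items():
        rings.setdefault(d, set()).add(node)
    result = []
    for d in sorted(rings):
        ring = rings[d]
        n = len(ring)
        e = sum(1 for u in ring for v in adj[u] if v in ring) // 2
        result.append(2 * e / (n * (n - 1)) if n > 1 else 0.0)
    return result

cc = hierarchical_clustering(torus_lattice(15), (7, 7))
for d in range(1, 7):  # rings 1..6 do not wrap around a 15 x 15 torus
    print(d, round(cc[d], 4))
```

The first two values are 3/7 and 1/6, and the sequence keeps decreasing over the levels that are unaffected by the periodic wrap-around.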

The convergence ratios obtained for each of the
considered network models are shown in Figure 12. These
curves are characterized by similar behavior among
themselves, with a nearly constant value at the first
levels and a peak at the last levels (except for the
regular models), along which the hierarchical expansion
tends to saturate, i.e. after the peak P is reached. Note
also that sharper peaks tend to be obtained for high
values of k. The positions of the peaks are near the
average shortest paths.

Figure 10: Hierarchical Common Degree measures
with the respective ± standard deviations obtained for
the considered models.

Figure 11: Hierarchical Clustering Coefficient Degree
measurements. Note the higher values of standard
deviation relative to the other measurements.

Figure 12: Convergence Ratio measurements for the
considered networks.

The convergence ratio curves obtained for the regu- 
lar networks are also qualitatively similar to those ob- 
tained for the other models, but the bordered graphs 
lack the peak and have a smooth decay along the last 
levels. 
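
A minimal sketch of the convergence ratio follows. It assumes that the ratio at level d is the number of edges arriving at ring d from ring d - 1 divided by the number of nodes in ring d; the indexing convention, the function name and the toy graph are assumptions made for illustration.

```python
from collections import deque

def convergence_ratio(adj, source):
    """Edges arriving at ring d from ring d - 1, per node of ring d (d >= 1)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    depth = max(dist.values())
    n = [0] * (depth + 1)
    incoming = [0] * (depth + 1)
    for u, d in dist.items():
        n[d] += 1
        incoming[d] += sum(1 for v in adj[u] if dist[v] == d - 1)
    return {d: incoming[d] / n[d] for d in range(1, depth + 1)}

toy = {0: [1, 2], 1: [0, 3, 4], 2: [0, 4], 3: [1], 4: [1, 2]}
ratios = convergence_ratio(toy, 0)
print(ratios)  # {1: 1.0, 2: 1.5}
```

Under this definition a value of 1 corresponds to a tree-like expansion, in which every node of the ring is reached by a single edge, while larger values indicate that several edges converge onto the same nodes.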

Interestingly, among all considered measurements,
it was the hierarchical common degrees and hierarchical
clustering coefficients which provided the most
distinctive shapes for each respective network model.
Therefore, such measurements, together with the log-log
node degree density, stand out as particularly promising
aids for identifying the category of the
network under study. Such a possibility is illustrated
in the following section.



8 Application to Real Networks 

The hierarchical measurements described above have
also been applied to characterize three complex
networks obtained from real data. These real networks
include: an Edinburgh Associative Thesaurus network
[39], the 1997 US Airports network [40] and
a protein-protein interaction network [31]. The
Edinburgh (Word) graph is an empirical association
network built from the words collected from human
subjects who are requested to enter the words that first
come to their mind after seeing a stimulus word. All the
responses are presented with similar frequency. The
detailed procedures for the creation of the Edinburgh
graphs can be found in [39]. This network has 23219 nodes
and k ~ 14, and is oriented and weighted. A similar
network has been considered in [36], while a preliminary
characterization of this type of network by
using the hierarchical node degree has been reported
in [9]. The protein-protein interaction graph (YEAST),
described in [31], has 2361 nodes with k ~ 3, where
a node represents a protein and an edge an interaction
between the two respective proteins. The US Airport
network is a compilation of flights between the
airports of the United States in 1997, where a node
represents an airport and an edge a flight between the
two airports. This network has a total of 332 nodes
(airports) and k ~ 6.4. All the considered real graphs
were compared to random and BA models with similar node
degrees (for the sake of space economy, not all these
graphs are shown in Section Characterization of
Complex Networks Models).

Figure 13: Hierarchical Number of Nodes obtained
for the real networks and considered generated
networks.

Figure 14: Hierarchical Node Degree distribution
along hierarchical levels; the results are similar to
those obtained for the Number of Nodes.

Figure 15: Inter Ring Degree values for real and
generated graphs.



Figure 16: Intra Ring Degree measurements obtained 
for the considered networks. 
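
Comparison models of the kind mentioned above, with the same number of nodes and a similar average degree, could be generated as in the following sketch. It uses a plain Erdos-Renyi construction and a simple preferential-attachment construction seeded with a small complete graph; these are common textbook variants and are assumptions here, not necessarily the generators used for the reported simulations. The seed value and the function names are illustrative.

```python
import random

def erdos_renyi(n, mean_degree, rng):
    """Random graph: each pair is linked with probability mean_degree / (n - 1)."""
    p = mean_degree / (n - 1)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def barabasi_albert(n, m, rng):
    """Preferential attachment: each new node links to m distinct existing nodes."""
    adj = {i: set() for i in range(n)}
    pool = []  # every node appears once per incident edge
    for i in range(m + 1):  # seed: complete graph on m + 1 nodes
        for j in range(i + 1, m + 1):
            adj[i].add(j)
            adj[j].add(i)
            pool += [i, j]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(pool))  # probability proportional to degree
        for target in chosen:
            adj[new].add(target)
            adj[target].add(new)
            pool += [new, target]
    return adj

def mean_degree(adj):
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)

rng = random.Random(0)
random_model = erdos_renyi(332, 6.4, rng)   # matches the airport network size
ba_model = barabasi_albert(332, 3, rng)     # m = 3 gives an average degree near 6
print(mean_degree(random_model), mean_degree(ba_model))
```

With m = 3 and 332 nodes the preferential-attachment graph has exactly 990 edges, i.e. an average degree of about 5.96.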



The results for the curves of hierarchical number of
nodes and node degrees are similar, as seen in Figures
13 and 14. Also, no significant differences were
observed between these results and those obtained for
the respective random or BA simulated networks.

More interesting results have been obtained for the
inter-ring degrees, shown in Figure 15. These curves
were observed to be more similar to the respective
simulated Barabasi-Albert curves. In fact, all
considered real networks are substantially similar to
scale-free networks, being characterized by a high
variance of node degrees and the presence of hubs.

The intra-ring degrees of the real networks are
shown in Figure 16. Interestingly, the curves obtained
for the airport (b) and yeast (c) networks present their
respective peaks to the left of the average shortest
path (the asterisk position), while in the BA models the
peaks tend to coincide with the asterisks, as obtained
for the word network.

Figure 17 shows the measurements of hierarchical
common degree. The curves for the airport (b) and
yeast (c) networks behave similarly to those obtained
for the respective BA curves, with a peak at the first
hierarchical level followed by a decay. However, the
word network (a) has a mixed behavior, beginning with an
increasing curve like in a BA model, but ending with a
convex decay like that typically observed in random
networks.

Figure 17: Hierarchical Common Degree Coefficient of
real networks.

Figure 18: Hierarchical Clustering Coefficient
measures.

Figure 19: Convergence Ratio Degree of real
networks.

The clustering coefficient measurements, shown in
Figure 18, substantiate the adherence of the real
networks to the respective BA simulated models. Another
interesting result which can be inferred from this
figure is that the hierarchical clustering
coefficients are wider and higher for the word network
(a) than for the respective BA simulations.

Figure 19 shows the convergence ratio measurement
values, which yielded the most different curves
among the three real networks and among these and
the respective models. The curve for the word
network (a) is more similar to the BA and random models,
being characterized by a low plateau followed by a
peak and an abrupt decrease along the last levels.
Different curve profiles have been obtained for the
airport (b) and yeast (c) curves. The yeast curve
presents a wider peak, whose position falls near the
center of the distribution. The peak of the curve
obtained for the airport network is displaced to the
left hand side, far away from the average shortest path.
This is a consequence of the fact that, unlike what
is obtained for the yeast network, the hubs are reached
after just a few hierarchical levels when starting from
most nodes. Indeed, we have verified experimentally that
the position and width of the peak of the convergence
ratio are ultimately defined by the distribution of hubs
along the hierarchies after starting from individual
nodes. Therefore, the relatively narrow peak near the
intermediate hierarchical levels obtained for the word
network indicates that the hubs in this structure are
found, on average, after 3 to 5 hierarchical levels.
Although also narrow, the peak of the airport network
occurs at the first levels, where most hubs are
concentrated. Finally, the wider peak obtained for the
yeast network is a consequence of the fact that the hubs
are distributed more evenly along the hierarchical
levels.
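
The distribution of hubs along the hierarchical levels, invoked above to explain the convergence ratio peaks, can be estimated with a multi-source breadth-first search. In the sketch below a hub is taken to be any node in the top fraction of the degree ranking; this threshold is a hypothetical choice made for illustration, since no hub criterion is fixed in the text, and the toy graph is made up.

```python
from collections import Counter, deque

def hubs_by_degree(adj, fraction=0.05):
    """Hypothetical hub definition: the top `fraction` of nodes by degree."""
    count = max(1, round(fraction * len(adj)))
    ranked = sorted(adj, key=lambda u: len(adj[u]), reverse=True)
    return set(ranked[:count])

def levels_to_nearest_hub(adj, hubs):
    """Multi-source BFS: number of levels from each node to its closest hub."""
    dist = {h: 0 for h in hubs}
    queue = deque(hubs)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# Node 0 is the only hub; a short chain 3-4-5 hangs from it.
toy = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0, 4], 4: [3, 5], 5: [4]}
hubs = hubs_by_degree(toy, fraction=0.2)
histogram = Counter(levels_to_nearest_hub(toy, hubs).values())
print(sorted(histogram.items()))  # [(0, 1), (1, 3), (2, 1), (3, 1)]
```

A histogram concentrated on the first levels would correspond to the airport-like situation, while a flatter one would correspond to hubs spread more evenly along the hierarchies.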

9 Node Categorization through Hierarchical Clustering

Another possibility for the analysis of complex networks
allowed by the consideration of hierarchical
measurements is the classification of individual nodes
into similar groups. In order to illustrate such a
potential for the characterization of nodes, two complex
networks are considered: a Barabasi-Albert model
(generated with N = 332 nodes and k ~ 6 edges) and the
airport network with 332 nodes and k ~ 6.4 edges
considered in the last section. This analysis focuses on
the clustering coefficient measurement, which is obtained
for all nodes of such networks. Only the hierarchical
levels up to 5 are considered in this example (the use
of additional levels tended to reduce the specificity of
the obtained measurements in the case of the real
networks considered in this section).

Figure 20: Dendrogram obtained for the BA model
considering the hierarchical clustering coefficients of
the nodes up to hierarchical level 5. Starting from the
right-hand side of the tree, the nodes are progressively
merged into clusters in terms of their similarity.

The hierarchical clustering coefficients are calculated
as usual and supplied to a hierarchical clustering
method [14], namely an agglomerative algorithm,
resulting in the trees (also called dendrograms) of
measurements shown in Figures 20 and 21 for the BA and
airport networks, respectively. For the sake
of better visualization, only the first four hierarchical
levels are shown in these figures. The x-axes in
these two trees refer to the similarity between nodes.
Starting at the right-hand side of the tree, the nodes
are merged on the basis of the similarity of their
hierarchical clustering coefficients, yielding the
taxonomical categorization of the nodes into meaningful
clusters corresponding to each branching point in the
tree. The y-axes express the size of the clusters at the
third hierarchical level. For instance, the cluster at
the top of Figure 20 contains substantially fewer nodes
than the third cluster from the bottom of the figure.
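
The agglomerative step can be sketched as follows. Each node is represented by a vector of its hierarchical clustering coefficients at the first five levels, and the two closest clusters are merged repeatedly until a single tree remains. The Euclidean distance and the average-linkage rule are assumptions chosen for illustration, since the specific variant of the method in [14] is not stated here, and the four signatures are made-up values.

```python
import math
from itertools import combinations

def agglomerate(signatures):
    """Average-linkage agglomerative clustering; returns the list of merges."""
    clusters = {i: [i] for i in range(len(signatures))}
    merges = []
    next_id = len(signatures)

    def linkage(pair):
        # Mean Euclidean distance between all members of the two clusters.
        a, b = pair
        total = sum(math.dist(signatures[i], signatures[j])
                    for i in clusters[a] for j in clusters[b])
        return total / (len(clusters[a]) * len(clusters[b]))

    while len(clusters) > 1:
        a, b = min(combinations(sorted(clusters), 2), key=linkage)
        merges.append((a, b, linkage((a, b))))
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges

# One row per node: clustering coefficient at levels 1..5 (illustrative values).
toy_signatures = [
    [0.40, 0.20, 0.10, 0.05, 0.00],
    [0.42, 0.21, 0.10, 0.05, 0.00],
    [0.05, 0.30, 0.60, 0.30, 0.10],
    [0.05, 0.35, 0.55, 0.30, 0.10],
]
merges = agglomerate(toy_signatures)
for a, b, height in merges:
    print(a, b, round(height, 3))
```

Each merge corresponds to a branching point of the dendrogram, and the merge height plays the role of the (dis)similarity shown along the x-axis; cluster identifiers 4 and 5 denote the groups created by the first two merges.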

Figures 22 and 23 show the graphs of average ±
standard deviation of the hierarchical clustering
coefficients obtained at each respective level in the
dendrograms. The mean degree and percentage of nodes
with respect to the whole network for each cluster are
given above each graph. Unlike the dendrograms in
Figures 20 and 21, the x-axes of the trees in Figures 22
and 23 do not consider the level of similarity between
the groups, which is done for the sake of better
visualization of the graphs obtained for each cluster of
nodes. Starting from the whole network cluster at the
right-hand side of the tree, we can observe the
progressive division of the node hierarchical signatures
in terms of subclasses sharing the basic patterns of
hierarchical clustering coefficient shown in the
respective graphs. Such a taxonomical characterization of
the nodes into subclasses provides substantially more
discrimination and characterization than the graphs
of average ± standard deviation obtained considering
the whole network, such as those discussed in the
previous section. This enhanced potential of node
discrimination and characterization provided by the
dendrograms is particularly useful in the case of
networks exhibiting the small world property, as such
cases tend to produce hierarchical signatures extending
over relatively few hierarchical levels.

Figure 21: Dendrogram obtained for the airport
network considering the hierarchical clustering
coefficients of the nodes up to hierarchical level 5.
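
The per-cluster graphs described above can be summarized with a few lines of Python. The sketch below computes, for one cluster of nodes, the average and standard deviation of the hierarchical clustering coefficient at each level, together with the mean degree and the percentage of the network contained in the cluster. The use of the population standard deviation, the function name and all numerical values are assumptions made for illustration.

```python
from statistics import mean, pstdev

def cluster_summary(signatures, degrees, members):
    """Per-level mean and standard deviation for the nodes listed in members."""
    levels = zip(*(signatures[i] for i in members))
    profile = [(mean(level), pstdev(level)) for level in levels]
    return {
        "profile": profile,                                  # (mean, std) per level
        "mean_degree": mean(degrees[i] for i in members),
        "percentage": 100.0 * len(members) / len(signatures),
    }

# Illustrative data: four nodes, two hierarchical levels each.
signatures = [[0.2, 0.4], [0.4, 0.8], [0.0, 0.0], [0.1, 0.1]]
degrees = [2, 4, 7, 9]
summary = cluster_summary(signatures, degrees, members=[0, 1])
print(summary)
```

Applying such a summary to every cluster obtained at a given level of the dendrogram would yield one average ± standard deviation curve per cluster.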

10 Concluding Remarks 

This article has addressed, in a didactic and
comprehensive fashion, how a set of hierarchical
measurements can be used for the characterization of
important topological properties of complex networks.
Motivated by the concept of extended neighborhoods
and distances, the identification of hierarchical
levels along the network, with reference to each of its
nodes, allows the definition of a series of useful and
informative hierarchical measurements of the network
topology, including hierarchical extensions of the
traditional node degree and clustering coefficient
measurements. The novel concepts of inter- and
intra-ring degrees, convergence ratio, edge degree and
edge clustering coefficient, as well as their
hierarchical versions, were also introduced here in
terms of the subnetwork generalization described in fW\.

Figure 22: Graphs of the average ± standard deviation of the hierarchical clustering coefficient obtained for the
BA model. Each graph corresponds to the clusters of nodes obtained in the first four hierarchical levels of the
dendrogram in Figure 20.

It has been shown, both analytically and through
simulations, that the hierarchical node degree of a
random network has a typical shape involving a limited
number of hierarchical levels while a peak is observed
at its intermediate portion, which is a consequence of
the finite size of the considered networks. Similar
dynamics were experimentally identified for scale-free
and regular network models. It was also shown,
through simulations, that the suggested set of
hierarchical measurements provided a wealth of
information about the topological structure of the
considered models (namely random, scale-free and
regular), allowing the identification of a number of
interesting properties specific to each of those models.
Of particular interest is the discriminative potential
of the hierarchical common degree and hierarchical
clustering coefficient. The potential of the reported
set of hierarchical measurements was further illustrated
with respect to three real networks: word associations,
airport connections and protein-protein interactions.
The comparison of the hierarchical measurements obtained
for these three networks with respective random, regular
and BA models with the same number of nodes and
similar node degree indicated that, except for a few
measurements (specific to each model), all three
real networks were most similar to the BA models.
In the case of the word associations network, some
measurements (i.e. hierarchical common degree and
inter-ring degree) yielded hierarchical curves which
were similar to random along some parts and similar
to BA at other parts. This network was also verified
to present the convergence ratio most similar to
that of a respective BA model. The concentration of
higher values of convergence ratio at the left hand side
of the curves obtained for the airport network also
confirmed the fact that the hubs in this network are
reached much faster than in all the other networks
considered in this article. Contrariwise, the
convergence ratio values obtained for the
protein-protein interaction network indicated that the
hubs in this real network are more evenly spaced from
one another. As a matter of fact, the convergence ratio
turned out to be the most informative of the
hierarchical measurements as far as the analysis of the
three real networks was concerned. This is a consequence
of the fact that the presence of a hub at a given
hierarchical level tends to strongly affect
the convergence ratio at that level.

Finally, the current article also proposed and
illustrated the possibility of using the enhanced
information provided by the set of hierarchical
measurements in order to organize the nodes of a network
into a taxonomy reflecting the similarities between the
nodes' connectivity. Such a methodology is particularly
promising because the obtained taxonomy can be used to
better understand the main classes of nodes present in a
given complex network, i.e. those classes obtained at
the higher levels of the taxonomy. Indeed, while the
limited number of hierarchical levels present in small
world networks such as random and BA models
constrains the potential of the hierarchical
measurements for the discrimination between such models,
the consideration of the main obtained classes of nodes
has been verified to provide further discrimination
between the compared networks.

A series of possible future investigations has been
motivated by the results reported in this article. First,
it would be interesting to assess in a systematic
fashion, and by using multivariate statistical analysis
and hypothesis tests, the potential of each measurement,
as well as their combinations, for discriminating the
possible class of a given network. Another issue
of particular relevance regards the identification and
preservation of hubs considering not only the immediate
neighbors of a node, but also the neighbors accumulated
along growing hierarchical levels. While such
a possibility has been preliminarily considered in [10],
it would be interesting to consider the preservation
of hubs as an increasing number of hierarchical levels
is taken into account. Such a study is under
development with respect to protein-protein association
networks and related results should be presented in
the future. Another study which can complement the
results described in the current work involves the
consideration of several types of small-world networks.
Finally, it would be interesting to apply the
hierarchical measurements to the characterization of
several other real networks such as protein-protein
interaction, internet and social connections, to name
but a few.